Abstract



We study planar point location in a collection of disjoint fat regions, and investigate the
complexity of local updates: replacing any region by a different region that is "similar" to the
original region (i.e., the size differs by at most a constant factor, and the distance between the
two regions is at most a constant times that size). We show that it is possible to create a linear
size data structure that allows for insertions, deletions, and queries in logarithmic time, and
allows for local updates in sub-logarithmic time on a pointer machine. We begin by describing a
solution for the 1-dimensional version of the problem, where we can achieve constant time local
updates. Then we show how the ideas can be extended to 2 dimensions.

We show that given constant similarity and fatness parameters:



• A set of n disjoint intervals in ℝ^1 can be maintained in an O(n) size data structure that
supports O(log n) worst-case time insertion, deletion, and point location queries, and O(1)
worst-case time local updates (Section 3). The data structure can be implemented on a real-valued
pointer machine.

• A set of n disjoint fat regions in ℝ^2 can be maintained in an O(n) size data structure that
supports O(log n) worst-case time insertion, deletion, and point location queries, and
O(log log n) worst-case time local updates (Section 4). The data structure can be implemented on
a real-valued pointer machine.

• We also give bounds that can handle arbitrary similarity and fatness parameters in Theorem 3.2
and Theorem 4.7 for the ℝ^1 and ℝ^2 case respectively.
1 Introduction

Planar point location lies at the heart of many geometric problems, and has been a major research
topic in computational geometry for the past 40 years. In the static version of the problem, one
aims to store a subdivision of the plane such that given a query point q in the plane, the cell of the
subdivision containing q can be retrieved quickly [25, 26, 46, 47, 62]. In the dynamic version of
the problem, one also allows changes to the data set, typically adding or removing line segments
to the subdivision [EE HQ GO L53].

The best known dynamic data structures on a real RAM are due to Cheng and Janardan [TS],
who achieve O(log^2 n) queries and O(log n) updates where n is the size of the subdivision, and
Arge et al. [5], who achieve O(log n) queries, O(log^{1+ε} n) insertions, and O(log^{2+ε} n)
deletions. A central open problem in this area is whether a linear-size data structure exists that
can support both queries and updates in logarithmic time, although this is known to be possible in
more specific settings such as monotone or rectilinear subdivisions [T21 L3H EH]. Husfeldt et al. [H]
prove that even in the very strong cell probe model, there are Ω(log n / log log n) lower bounds on
both queries and updates.

Despite these theoretical results, practical evidence suggests that updating a data structure should
be fast. Intuitively, an update to a data set should not need to depend on n at all, unless we need
to find the place where the update takes place (i.e., we need to do a point location query). In
this paper, we study point location data structures on a collection of fat objects in the plane that
support local updates: replace any region by a different region that is similar to the original. We
show that the lower bounds on updates can be broken in this setting, while still allowing O(log n)
queries and using O(n) storage.

The idea of local updates is not new. For example, Nekrich [SS] considers (on a word-RAM) the
local update operation insert_Δ(x, y), which inserts a new element x into a 1-dimensional sorted
list, given a pointer to an existing element y that satisfies |x − y| < Δ for some distance
parameter Δ. There is also a related concept called finger updates, where the position of the
update is known; see e.g. Fleischer [31]. However, our results are the first in this area that work
in a geometric setting, and they can be implemented on a real-valued pointer machine. (See
Appendix A for a discussion of computation models.)

In order to obtain our results, we develop several tools which we believe are interesting in their own
right, such as a dynamic balanced compressed quadtree with worst-case constant time updates,
and a tree decomposition that supports logarithmic searches and constant time local changes (see
Section 2.2).



1.1 Problem description

We define the problem in general dimension d, but restrict our attention to d ∈ {1, 2} in the
remainder of this paper. We use |R| to denote the diameter of a region R ⊂ ℝ^d, that is,
|R| = max_{p,q ∈ R} |pq|. We say two fat¹ regions R_1, R_2 ⊂ ℝ^d are ρ-similar² if
|R_1 ∪ R_2| ≤ ρ min{|R_1|, |R_2|}, see Figure 1.
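To make the definition concrete in one dimension, the following sketch tests ρ-similarity for closed intervals. The tuple representation `(lo, hi)` and the function names `diameter`, `union_diameter` and `is_similar` are illustrative assumptions, not notation from the paper.

```python
def diameter(interval):
    """Diameter |R| of a closed interval given as (lo, hi)."""
    lo, hi = interval
    return hi - lo


def union_diameter(a, b):
    """Diameter of the union of two intervals: the span between the extreme endpoints."""
    return max(a[1], b[1]) - min(a[0], b[0])


def is_similar(a, b, rho):
    """True if |a ∪ b| <= rho * min(|a|, |b|), i.e. a and b are rho-similar."""
    return union_diameter(a, b) <= rho * min(diameter(a), diameter(b))


# Two unit intervals one unit apart are 3-similar; moving one further away breaks it.
print(is_similar((0.0, 1.0), (2.0, 3.0), 3.0))  # True
print(is_similar((0.0, 1.0), (5.0, 6.0), 3.0))  # False
```

The single inequality bounds both the size ratio and the separation at once: a much larger partner or a distant partner both inflate the diameter of the union.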

Problem 1.1: Given a set ℛ of n disjoint fat regions in ℝ^d, store them in a data structure that
allows:

• queries: given a point q ∈ ℝ^d, return the region in ℛ that contains q (if any) in Q(n) time;

• local updates: given a region R ∈ ℛ and a region R′ that is ρ-similar to R, replace R by R′
in the data structure in U(n) time; and

• global updates: delete an existing region R from the data structure or insert a new region
R′ into the data structure in Q(n) + U(n) time

such that Q(n) = O(log n) but U(n) = o(log n). Note that a local update allows for an arbitrary
number of smaller regions to be "between" the old region R and the new region R′.

¹ We formally define fat regions in Section 3.
² This definition captures two ideas at once: firstly, the sizes of R_1 and R_2 can differ by at most
a factor of ρ, and secondly, the distance between R_1 and R_2 can be at most a factor ρ times the
smaller of these sizes.



1.2 Applications 

Tracking moving objects. A natural application of our data structure is to keep track of moving
objects. One may imagine a number of objects of different sizes moving unpredictably in an
environment at different speeds. A popular method for dealing with moving objects is to discretize
time and process the new locations of the objects at each time step. The naive way to do this
is to simply rebuild an entire data structure every time step. Our data structure can be used to
process such changes more efficiently.

In computational geometry, there is a large literature on dealing with moving objects (or points).
Kinetic data structures are based on the premise that a data structure should not need to be
updated each time step, but rather only when some combinatorial feature of a description of
objects changes [TJ, 35]. A fundamental underlying assumption in kinetic data structures
is that trajectories of the moving objects are predictable, at least in the short term. However,
in many modern real-world scenarios, trajectories are not predetermined; they are discovered in
an online and inherently discrete fashion. As a result, several theoretical approaches to deal with
unpredictable motion have been suggested recently, in various settings [HI, 23, 28, 33, 55, 67]. A
common assumption in these works is to bound the maximum displacement after each update
(or velocity) of the moving points. An interesting feature of our data structure is that we can
simultaneously maintain objects moving at very different scales, with a velocity bound that is
dependent on the size of the object.

Data imprecision. A different motivation for studying this problem comes from the desire to
cope with data imprecision. One way to model an imprecise point is to keep track of a region of
possible locations of the point [3H1 LSZ] (see also [25] and the references therein). Recently, there
has been a lot of activity in this area [T71 2U1 H31 ES]. Although algorithms to deal with imprecise
data are beginning to be well understood in a static setting, little effort has been devoted to dealing
with dynamic imprecise points. However, in many settings imprecision is inherently dynamic (e.g.
time-dependent or "stale" data), or explicitly made dynamic (e.g. updates from new samples of
the same point).

One of the simplest geometric queries one can imagine on a data structure that stores a point set is
the identity query: given a query point, is there a point in the data structure that is equal to the
query point? When the points in the data structure are imprecise, the answer to this question may
have three possible values: "certainly", "possibly", or "certainly not." Distinguishing between the
second and last answer³ comes down to testing whether the query point (which we assume is a
precise point) is contained in any of the uncertainty regions of the imprecise points. Therefore,
we may view the problem as a dynamic point location problem in a set of changing regions.

If we only wish to support increased precision updates (which would correspond to stationary,
but imprecise points), this question is closely related to existing work in the update complexity
model [15, 32], in which one attempts to minimize the number (or amount of gained precision)
of updates necessary to correctly output some structure; and to work on preprocessing imprecise
points [HI, 24, HOI, 50, 65], in which one tries to prepare a set of imprecise points for faster
computation of some structure on the precise points once they become available. While these
results do not analyze the time complexity of single updates, they do provide some evidence that
sub-logarithmic update time may be possible.

1.3 Solution outline

Geometric data structures are often either based on a recursive decomposition of the data (e.g.
a binary search tree) or a recursive decomposition of space (e.g. a quadtree). Neither of those
techniques is strong enough by itself to solve the problem at hand, so our solution combines
both techniques. We base our solution on a dynamic balanced⁴ compressed quadtree [60], the
details of which are covered in Section 2. However, the quadtree is not built on the regions
directly. Rather, for each region R ∈ ℛ, we store a representative point rn that lies somehow "in
the middle" of R. We build search structures over the quadtree which allow us to quickly locate
the quadtree cells containing relevant data. We answer point-location queries by locating the
smallest quadtree cell containing the query point and then searching the quadtree bottom-up for
regions which intersect this cell. This approach allows us to handle input described by arbitrary
real numbers and to operate mostly on abstract combinatorial objects. We only require basic
operations on our input: compare two numbers, and find a bounding box around a small set of
points (see Appendix A for more details).

We first illustrate the main ideas of our approach in the simpler one-dimensional version of the
problem, in which we do not need any additional search structure once we have located the correct
quadtree cell. In Section 3 we show how the dynamic balanced quadtree achieves worst-case
constant time local updates and logarithmic point location queries for intervals in ℝ^1. In Section 4
we solve the more complex two-dimensional problem, in which we require more sophisticated search
structures. We "mark" a small number of carefully chosen quadtree cells near the representative
point, and show how to adapt a marked-ancestor data structure to find the relevant regions once
we have located the correct quadtree cell. By leveraging the marked-ancestor tree and edge-oracle
tree described in Section 2, we are able to support queries in O(log n) time and local updates in
O(log log n) time. We can also support insertions and deletions as the composition of a query and
a local update.

Our search structures require the assumptions made in Section 1.1 when we defined local updates.
That is, we assume that the regions are fat and disjoint. Realistic input models are intended
for designing algorithms that are provably efficient in practice, and the fat-and-disjoint model is
ubiquitous (see e.g. [HI, 37, E9] and citations therein). Note that the fat-and-disjoint model is
not a direct requirement of the quadtree, as the quadtree only stores the representative points of
regions. Rather, we leverage the model in order to bound the number of directions from which a
region may overlap the query cell, and thus facilitate fast queries in the marked-ancestor structure.



³ Under the mild assumption that all points have at least some imprecision and there are only
finitely many points, the first answer will never occur.
⁴ The quadtree is balanced in a geometric sense, but may still have linear depth. See Section 2.
Thus we achieve our goal of maintaining a data structure with query time Q(n) = O(log n) but
local update time U(n) = o(log n). Note that for planar point location in rectilinear subdivisions,
Q(n) = O(log n) and U(n) = O(log n) can be achieved on a RAM by using the complex data
structures of Blelloch or Giora and Kaplan [M] and removing R and re-inserting R′. However,
we show that changing a region locally is more efficient than naively removing a region and inserting
a new one. Iacono and Langerman [42] also give a solution which achieves O(log N) query time
and O(1) update time if the regions are restricted to be disjoint axis-aligned fat hyper-rectangles
with coordinates drawn from a fixed universe [N]. However, in our solution we are able to achieve
sub-logarithmic local updates without requiring that the regions be axis-aligned, rectangular, or
of limited precision. Moreover, our solution works on a real-valued pointer machine and does not
require hashing, bit-level manipulation, or even the floor operation (see Appendix C).

2 Tools

Before attacking the dynamic point location problem, we review several known and new concepts,
techniques, data structures and notation that will help us.

2.1 Preliminaries

Quadtrees. Let B be an axis-aligned square.⁵ A quadtree T on B is a hierarchical decomposition
of B into smaller axis-aligned squares called quadtree cells. Each node v of T has an associated
cell C_v ⊂ ℝ^d, and v is either a leaf or has 2^d equal-sized children whose cells subdivide C_v
[2"Tll3"Ul, 39, 61]. We denote the parent of a node v by v̄. A pair of cells are called neighbors if
they are interior disjoint and meet at an edge or corner. A leaf v is α-balanced if α|C_v| ≥ |C_u| for
every larger neighbor C_u of C_v. We say T is α-balanced if every leaf in T is α-balanced. If α is a
small constant (e.g., 2 or 4), then we simply call the quadtree T balanced.

Let P ⊂ ℝ^d be a set of n points contained in B. We say T is a valid quadtree for P if every leaf
of T contains at most 1 point of P. We will be maintaining a valid quadtree for a certain set P,
and require that the points and the leaves that contain them are always connected by bidirectional
pointers. It is known that quadtrees may have unbounded depth if P has unbounded spread,⁶ so
in order to give any theoretical guarantees the concept is usually refined. Given a large constant
α, an α-compressed quadtree is a quadtree with additional compressed nodes. A compressed node
v has only one child, whose cell has size at most |C_v|/α and is such that the part of C_v outside
the child's cell contains no points from P.⁷ In the remainder, we assume for simplicity of exposition
that the child is aligned with v, that is, if we keep subdividing C_v we will eventually create the
child's cell.⁸

The compressed nodes of a quadtree T cut the tree into a number of components that correspond
to smaller regular (uncompressed) quadtrees. We say T is α-balanced if all these smaller trees are
α-balanced. It follows directly from Theorem 1 of Bern et al. [TU] that a balanced compressed
quadtree of linear complexity exists for any set of points P.

⁵ We use the term square to mean a d-dimensional hypercube, since our main focus is on d = 2.
⁶ The spread of a point set P is the ratio between the largest and the smallest distance between
any two distinct points in P.
⁷ Such nodes are also often called cluster-nodes in the literature [10, 11, ITS].
⁸ While this assumption is realistic in practice, on a pure real-valued pointer machine it is not
possible to align compressed nodes of arbitrary size difference in constant time. In Section C.1 we
show how to adapt the results to unaligned compressed nodes.

Löffler et al., Dynamic Planar Point Location with Sub-Logarithmic Local Updates

Static edge-oracle trees. Let T be an abstract tree of size |T| with constant maximum degree
d. Suppose that the nodes in the tree are given unique labels, and suppose that each edge e ∈ T
has an oracle which for any node label x can answer the following question: "If we removed e such
that T is split into two components, which component would contain the node labeled x?" The
edge-oracle tree is a search structure built over the edges of T which allows us to navigate from
any node u ∈ T to any other node v ∈ T in O(log |T|) time while examining only O(log |T|) edges.
We can construct an edge-oracle tree for T by recursively locating an edge which divides T into
two components of approximately equal size.
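The recursive construction can be sketched as follows. This is a minimal static illustration under stated assumptions: the function names `build_edge_oracle_tree` and `locate` are hypothetical, the tree is given as a list of label pairs, and the per-edge oracle is simulated by storing the node set of one side of each cut, which a real implementation would replace by a constant-time oracle.

```python
def build_edge_oracle_tree(nodes, edges):
    """Recursively split a connected tree at its most balanced edge.

    Returns a nested dict. The stored set "below_nodes" stands in for the
    edge oracle (an assumption made only for this sketch).
    """
    nodes = set(nodes)
    if len(nodes) == 1:
        return {"leaf": next(iter(nodes))}
    adj = {x: [] for x in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Root the component anywhere and record a DFS preorder.
    root = next(iter(nodes))
    parent, order, stack = {root: None}, [], [root]
    while stack:
        x = stack.pop()
        order.append(x)
        for y in adj[x]:
            if y != parent[x]:
                parent[y] = x
                stack.append(y)
    # Subtree sizes, accumulated from the leaves upwards.
    size = {x: 1 for x in nodes}
    for x in reversed(order):
        if parent[x] is not None:
            size[parent[x]] += size[x]
    # Removing edge (parent[x], x) splits off size[x] nodes; take the most balanced.
    n = len(nodes)
    cut = min((x for x in nodes if x != root), key=lambda x: abs(n - 2 * size[x]))
    below, stack = set(), [cut]
    while stack:
        x = stack.pop()
        below.add(x)
        stack.extend(y for y in adj[x] if y != parent[x])
    above = nodes - below
    inside = lambda part: [e for e in edges if e[0] in part and e[1] in part]
    return {
        "edge": (parent[cut], cut),
        "below_nodes": below,
        "below": build_edge_oracle_tree(below, inside(below)),
        "above": build_edge_oracle_tree(above, inside(above)),
    }


def locate(tree, target):
    """Descend the edge-oracle tree; return (leaf label, number of edges examined)."""
    examined = 0
    while "leaf" not in tree:
        examined += 1
        tree = tree["below"] if target in tree["below_nodes"] else tree["above"]
    return tree["leaf"], examined


# A path on 8 nodes has linear depth as a tree, but every node is found after 3 oracle calls.
path_tree = build_edge_oracle_tree(range(8), [(i, i + 1) for i in range(7)])
print(locate(path_tree, 5))  # (5, 3)
```

The point of the example is that the search depth depends on the balanced edge cuts rather than on the depth of the underlying tree.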

The static version of this structure is similar to the well known centroid-decomposition method
for building a logarithmic height search structure over an unbalanced tree. In fact, Arya et al. [6]
used a similar technique to support point location in a quadtree, but only considered the static
setting.

Local updates. For a one-dimensional ordered list, data structures that can handle local (finger)
updates are well known. One of the simplest implementations on a pointer machine is due to
Fleischer [31].

Marked-ancestor problem. Suppose we are given a simple path where some nodes in the path
can be marked, and we want to support the following query for any node x: "Which is the first
marked node which comes after node x in the path?" We also want to support updates where
nodes can be marked or unmarked and inserted into or deleted from the path. This is known
as the marked successor problem. A natural generalization of this problem is to extend support
from paths to any rooted tree. Now the query we must support is "Which is the lowest marked
ancestor of x in the tree?" This is known as the marked-ancestor problem. As in the marked
successor problem, we also want to support updates, in which nodes are marked or unmarked, and
insertions/deletions of nodes to/from the tree. Alstrup et al. [H, 3] gave the following results for
the marked-ancestor problem on a word-RAM.

Lemma 2.1: We can maintain a data structure over any rooted tree T which supports insertions
and deletions of leaves in O(1) amortized time, marking and unmarking nodes in O(log log n)
worst-case time, and lowest marked ancestor queries in O(log n / log log n) worst-case time.
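For orientation, the interface of the marked-ancestor problem can be sketched with a naive parent-pointer walk. This is not the structure behind Lemma 2.1: a query here takes time proportional to the depth of x, whereas marking is trivially constant time. The class name `Node` and the function `lowest_marked_ancestor` are hypothetical, and the sketch assumes that a marked node counts as its own lowest marked ancestor.

```python
class Node:
    """A node of a rooted tree with a parent pointer and a mark bit."""

    def __init__(self, parent=None):
        self.parent = parent
        self.marked = False


def lowest_marked_ancestor(x):
    """Walk towards the root and return the first marked node, or None if there is none."""
    v = x
    while v is not None and not v.marked:
        v = v.parent
    return v


# A chain root -> a -> b: marking a node closer to b changes the answer for b.
root = Node()
a = Node(root)
b = Node(a)
root.marked = True
assert lowest_marked_ancestor(b) is root
a.marked = True
assert lowest_marked_ancestor(b) is a
```

The data structures cited above exist precisely to avoid this linear walk on deep trees such as unbalanced quadtrees.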

2.2 New Tools

We show how to maintain a dynamic balanced compressed quadtree and a dynamic edge-oracle
tree which supports local updates.

Dynamic balanced quadtrees. A dynamic quadtree is a data structure that maintains a quadtree
Q on a point set P under insertion and deletion of points. In order to maintain a valid quadtree
of linear size, we respond with split and merge operations respectively. A split operation takes a
leaf v of Q and adds 2^d children to it; a merge operation takes 2^d leaves with a common parent
and removes them. Clearly, split and merge can be made to run in O(1) time for quadtrees, since
we are given the location in the quadtree where these operations are applied.
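As an illustration of why split and merge are constant-time pointer operations, here is a minimal sketch for d = 1, where each cell is a half-open interval with 2^d = 2 children. The class `Cell` and the functions `split` and `merge` are hypothetical names; compression and balancing are deliberately left out.

```python
class Cell:
    """A 1-dimensional quadtree cell [lo, hi) storing at most one point when it is a leaf."""

    def __init__(self, lo, hi, parent=None):
        self.lo, self.hi, self.parent = lo, hi, parent
        self.children = []   # empty for a leaf, two cells otherwise
        self.point = None


def split(leaf):
    """Give a leaf its two equal-sized children and push its point down."""
    mid = (leaf.lo + leaf.hi) / 2
    leaf.children = [Cell(leaf.lo, mid, leaf), Cell(mid, leaf.hi, leaf)]
    if leaf.point is not None:
        child = leaf.children[0] if leaf.point < mid else leaf.children[1]
        child.point, leaf.point = leaf.point, None


def merge(cell):
    """Remove two leaf children that together hold at most one point."""
    points = [c.point for c in cell.children if c.point is not None]
    assert all(not c.children for c in cell.children) and len(points) <= 1
    cell.point = points[0] if points else None
    cell.children = []


root = Cell(0.0, 1.0)
root.point = 0.75
split(root)   # the point moves to the right child [0.5, 1.0)
merge(root)   # undoing the split restores the original leaf
```

Both operations touch only the given cell and its children, which is the sense in which the location "is given".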

In a dynamic compressed quadtree, we must consider the case where the node v being split is
a compressed node. In this case, v gets 2^d new children, and the original child of v needs to be
connected to the correct new child. If the size factor is now less than α, this child gets further
subdivided until the two components are merged. A merge operation does the opposite. These
operations can still be implemented in O(1) time.

In a balanced quadtree, after a split operation the balance may be disturbed, and we require
additional cells to be split to restore the balance. This operation may take O(n) time in the
worst case, if we want to maintain 2-balance, because the rebalancing operation can "cascade".
However, if we only perform split operations, then we can maintain 4-balance in O(1) worst-case
time per split.
Fig. 2: (a) A 2-balanced quadtree on a set of points. True cells are shown in black, B-cells in yellow.
(b) An insertion could cause a linear "cascade" of cells needing to be split if we want to maintain
2-balance. Therefore, we only split the direct neighbors (purple), which may now be only 4-balanced.



Lemma 2.2: We can maintain 4-balance in a dynamic compressed quadtree in O(1) worst-case
time per update.

Proof: (sketch) We call a quadtree cell true if its parent contains at least two points of P and it
would therefore be present in any valid unbalanced quadtree, and we call a quadtree cell a B-cell
otherwise (i.e., it was only added to maintain quadtree balance). Figure 2(a) shows an example.
We will maintain the property that each true cell is 2-balanced with respect to its larger neighbors,
and every B-cell is 4-balanced with respect to its larger neighbors.

Let C be a true quadtree cell which is 2-balanced with respect to its neighbors. When we split C,
we examine the 3^d − 1 neighbors of C, and we split a larger neighbor C′ if the children of C are not
2-balanced with respect to C′. Thus we restore 2-balance to C at the cost of potentially inserting
some B-cells which are only 4-balanced, see Figure 2(b). However, it takes two operations to
split a B-cell. First, we must insert a point into the B-cell, which does not require a split since the
cell was already split to maintain balance. This changes the cell from a B-cell to a true cell. We
also spend a constant amount of time examining each of the O(1) neighbors of the newly true cell,
and splitting them if necessary so that the cell is now 2-balanced with respect to its neighbors.

We may be splitting a compressed node. Recall that if the size factor between a compressed node
v and its child is less than α, then we continue to split v a constant number of times until the
two components "grow together". This case only requires a constant number of additional splits,
and each split can be handled in worst-case O(1) time as before. We maintain balance in the tree
rooted at v up to the level of its child, which ensures that no nodes more than a constant factor
smaller than the child are on the outside, and only O(1) work needs to be done to rebalance the tree.

When we delete a point p from a cell C, we restore the quadtree to what it would be had p never
been inserted, essentially "undoing" the insertion of p. Since the original splitting and balancing
only took O(1) time, it clearly only takes O(1) time to undo that splitting and balancing. If C was
a B-cell, there is no change. If C was a true cell, and its parent cell has smaller neighbors which
would become unbalanced if we merged the parent, then the parent may remain split and C becomes
a B-cell. Otherwise, we merge the parent. □



Dynamic edge-oracle trees. There have been several recent results which generalize classic
one-dimensional dynamic structures to a multidimensional setting by combining classic techniques
with a quadtree-style space decomposition. For example, the skip-quadtree [29] combines the
quadtree and a skip-list, the quadtreap [SB] combines a quadtree and a treap, and the splay quadtree
combines a quadtree with a splay tree [5J5]. However, surprisingly, there are no multidimensional
data structures which incorporate finger searching techniques, i.e., structures that are able to
support both logarithmic queries and worst-case constant time local updates on a quadtree. In
the following we show how to build a dynamic edge-oracle tree which combines tree-decomposition
and finger searching techniques with a quadtree to support O(log n) queries and O(1) local updates.

Fig. 3: A not necessarily balanced or rooted abstract tree T (a) and its corresponding edge-oracle
tree (b). Labels on edges of T match up with the label of the corresponding node in the edge-oracle
tree. In the final structure we maintain the edge-oracle tree as a modified (a, b)-tree; small subtrees
are maintained as buckets (linked lists) to facilitate fast updates.

Lemma 2.3: If v is a leaf in an unweighted free tree T, then the edge incident to v has height O(1)
in the corresponding edge-oracle tree.

Proof: Recall that we construct the static edge-oracle tree for T by recursively locating an edge
which divides T into two components of approximately equal size. Thus the edges are split in
order to maintain a balanced number of edges in each subtree of the edge-oracle tree. Since the
edge adjacent to a leaf has no edges on one side of the split and at least one edge on the other side
of the split, such an edge will not be chosen for splitting by the algorithm until there are no other
edge choices in the subtree. □

Lemma 2.4: Let T be a tree subject to dynamic insertions and deletions of leaves. We can maintain
an edge-oracle tree over T in O(1) worst-case time per local update.

Proof: An insertion or deletion of a leaf and its associated edge in T corresponds to an insertion
or deletion of a node in the edge-oracle tree. Since the location of the node is known, and the
height of the node is O(1), we can borrow techniques from Fleischer [31] to perform updates in
O(1) time. The techniques are surprisingly simple, and we only sketch them here. We maintain
the edge-oracle tree as an (a, b)-tree. However, we collapse each subtree of size O(log n) into a
single pseudo-node called a bucket. The original nodes within the bucket are maintained in a
simple linked list. When performing a query, we locate the correct bucket and iterate through
the list for the correct original node in O(log n) time. Given a pointer to an original node, an
update is simply an O(1) linked-list operation. If many nodes are inserted into the same bucket,
then a bucket may become too large. However, Fleischer shows how to distribute the rebuilding of
buckets over later updates, only spending O(1) time per update, such that the size of each bucket
never deviates significantly from Θ(log n). □

Lemma 2.5: In a quadtree, an edge-oracle can be simulated in O(1) time.

Proof: In a quadtree, we are searching for the quadtree leaf which contains a query point q. Each
edge in a quadtree goes between a child cell and a parent cell that contains it. If the child cell
contains the query point, then the leaf must be the child cell or one of its descendants, and the
oracle returns the corresponding component of the quadtree. Otherwise, the oracle returns the
other component of the quadtree. Since each quadtree cell is aware of its bounding box, we can
compare the query point with the child cell and return our answer in constant time.
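The oracle in this proof amounts to a single point-in-box test. The following sketch shows it for d = 2; the function name `edge_oracle`, the box representation as a pair of corner points, and the half-open boundary convention are assumptions made for illustration.

```python
def edge_oracle(child_box, q):
    """Decide on which side of a parent-child quadtree edge the leaf containing q lies.

    child_box is ((xmin, ymin), (xmax, ymax)) for the child cell, treated as
    half-open. Returns "child" if q lies in the child cell (so the leaf is the
    child or one of its descendants) and "parent" otherwise.
    """
    (xmin, ymin), (xmax, ymax) = child_box
    inside = xmin <= q[0] < xmax and ymin <= q[1] < ymax
    return "child" if inside else "parent"


print(edge_oracle(((0.0, 0.0), (1.0, 1.0)), (0.5, 0.5)))  # child
print(edge_oracle(((0.0, 0.0), (1.0, 1.0)), (1.5, 0.5)))  # parent
```

Only a constant number of coordinate comparisons are needed, which is compatible with the real-valued pointer machine model used in the paper.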

Lemma 2.6: Let P be a set of n points, and Q be a balanced and compressed quadtree on P. We
can maintain P and Q in a data structure that supports O(log n) point location queries in Q, and
local insertions and deletions of points in P (i.e., when given the corresponding cells of Q) in O(1)
time.

Proof: By Lemmas 2.4 and 2.5 we can maintain an edge-oracle tree over the compressed quadtree
which can find the unique quadtree cell containing a query point in O(log n) time and respond to
local updates in the quadtree in O(1) time. □

Marked-ancestor trees. We show how to answer marked-ancestor queries on a pointer machine.
Details are given in Section C.2.

Lemma 2.7: We can maintain a data structure over any rooted tree T which supports insertions
and deletions of leaves in O(1) amortized time, marking and unmarking nodes in O(log log n)
worst-case time, and queries for the lowest marked ancestor in O(log n) worst-case time. All
operations are supported on a pointer machine.


3 One-Dimensional Case

To aid our exposition, we first present a solution to the one-dimensional version of the problem.
Our data structure illustrates the key ideas of our approach while being significantly simpler than
the two-dimensional version. Note that in ℝ^1, our input set ℛ of geometric regions is a set of
non-overlapping intervals. The difficulty of the problem comes from the fact that a local update
may replace any interval by another interval of similar size at a distance related to that size;
hence, it may "jump" over an arbitrary number of smaller intervals. Our solution works on a pure
real-valued pointer machine, and achieves constant time updates.

3.1 Definition of the data structure

Our data structure consists of two trees. The first is designed to facilitate efficient updates and
the second is designed to facilitate efficient queries. The update tree is a compressed quadtree
on the center points of the intervals. The quadtree stores a pointer to each interval in the leaf
that contains its center point. We also augment the tree with level-links, so that each cell has a
pointer to its adjacent cells of the same size (if they exist), and maintain balance in the quadtree
as described in Lemma 2.2. The leaves of the quadtree induce a linear size subdivision of the
real line; the query tree is a search tree over this subdivision⁹ that allows for fast point location
and constant time local updates. We also maintain pointers between the leaves of the two trees,
so that when we perform a point location query in the query tree, we also get a pointer to the
corresponding cell in the quadtree, and given any leaf in the quadtree, we have a pointer to the
corresponding leaf in the query tree. Figure 4 illustrates the data structure.

⁹ Although we could technically also use a search tree directly on the original intervals, we prefer
to see it as a tree over the leaves of the quadtree in preparation for the situation in ℝ^2.
Fig. 4: A set of disjoint intervals and their center points (red); a compressed quadtree on the center
points (blue); and a search tree on the leaves (or parts of internal cells not covered by children) of
the quadtree (green).



Lemma 3.1: Let I ∈ ℛ be an interval, and let I′ be another interval that is O(ρ)-similar to I.
Suppose we are given a quadtree storing the midpoints of the intervals in ℛ and a pointer to the
leaf containing the midpoint of I. Then we can find the leaf which contains the midpoint of I′ in
O(log ρ) time.

Proof: Let C be the quadtree leaf cell which contains the center point of I, and let C′ be the
quadtree cell which contains the center point of I′. Observe that I is at most four times as large
as C: otherwise, I would completely cover the parent cell of C, but then no other intervals could
have their center points in the parent to cause it to be split. Similarly, I′ is at most four times as
large as the new quadtree cell C′. Therefore, the distance between I and I′ is proportional to the
size of C (and C′). Since we maintain balance in the quadtree according to Lemma 2.2, we can find
C′ from C by following O(1) level-link and parent-child pointers in the quadtree. □

3.2 Handling queries

In a query, we are given a point $q$ and must return the interval in $\mathcal{R}$ that contains $q$. We search
in the query tree to find the quadtree leaf cell which contains $q$ and its two neighboring cells in
$O(\log n)$ time. Any interval $I$ which overlaps $q$ must have its center point in one of these three
cells (otherwise, there would be an empty cell between the cell containing $q$ and the cell containing
the center point of $I$). We compare $q$ with the intervals stored at these cells (if any) to find the
unique interval that contains $q$, or report that there is no containing interval, in $O(1)$ time. Thus
the total time required by a query is $O(\log n)$.
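
The query procedure can be illustrated with a minimal Python sketch. It is not the paper's data structure: the query tree is replaced by a sorted array searched with `bisect`, the leaf cells are supplied by hand rather than produced by a balanced compressed quadtree, and the names `QueryTree1D` and `locate` are hypothetical. It only shows the "find the leaf, then inspect it and its two neighbors" pattern.

```python
from bisect import bisect_right


class QueryTree1D:
    """Toy stand-in for the query tree: a sorted array of quadtree leaf cells.

    Each cell is (left, right, interval), where interval is the (lo, hi) pair
    whose center point lies in the cell, or None if the cell is empty.
    """

    def __init__(self, cells):
        self.cells = cells
        self.lefts = [left for left, _right, _interval in cells]

    def locate(self, q):
        # Binary search plays the role of the O(log n) descent in the query tree.
        i = bisect_right(self.lefts, q) - 1
        if i < 0 or q >= self.cells[-1][1]:
            return None  # q lies outside the root cell
        # Inspect the leaf containing q and its two neighbors: O(1) comparisons.
        for j in (i - 1, i, i + 1):
            if 0 <= j < len(self.cells):
                interval = self.cells[j][2]
                if interval is not None and interval[0] <= q < interval[1]:
                    return interval
        return None


# Hand-built leaves on [0, 8); intervals (1.5, 3.5) and (5.0, 7.0) are disjoint.
cells = [(0, 2, None), (2, 3, (1.5, 3.5)), (3, 4, None), (4, 8, (5.0, 7.0))]
tree = QueryTree1D(cells)
```

Here the query point 1.7 falls in the empty leaf [0, 2), and the containing interval is found through the right neighbor, whose cell holds the interval's center point.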

3.3 Handling updates

In an update, we are given a pointer to an interval $I \in \mathcal{R}$, and a new interval $I'$ that should
replace $I$. We follow pointers in the quadtree to find the new cell which contains the center point.
If $I$ and $I'$ are $O(1)$-similar, Lemma 3.1 implies that the new cell is at most a constant number of
cells away, and we find the correct cell in $O(1)$ time. Then we remove the center point from the
old cell and insert it into the new cell, performing any compression or decompression required in
the quadtree. This only requires a constant number of pointer changes in the quadtree and can
be done in $O(1)$ worst-case time. We may also need to restore balance to the quadtree, which
requires $O(1)$ worst-case time by Lemma 2.2. Finally, we follow pointers from the quadtree to the
query tree, and perform the corresponding deletion and insertion in that tree, which by Lemma 2.6
takes only constant time. Thus, the entire update can be completed in $O(1)$ worst-case time.

Note that we can also insert or delete intervals from the data structure in $O(\log n)$ time; we
perform a query to locate where the interval belongs and a local update to insert it or remove it.

Theorem 3.2: We can maintain a linear-size data structure over a set of $n$ non-overlapping intervals
such that we can perform point location queries and insertion and deletion of intervals in $O(\log n)$
worst-case time and local updates in $O(\log \rho)$ worst-case time.

4 Two-Dimensional Case

We now focus our attention on disjoint fat regions in the plane. Intuitively, a fat region should
not have any long skinny pieces. We consider two types of fat regions which precisely capture this
intuition: thick convex regions and wide polygons. We say $R$ is $\beta$-thick if there exists a pair of
concentric balls $I$, $O$ with $I \subseteq R \subseteq O$ and $|O| \le \beta|I|$; see Figure 5(a). Let $\delta \ge 1$. A $\delta$-corridor is an
isosceles trapezoid whose slanted edges are at most $\delta$ times as long as its base. A simple polygon
$P$ is $\delta$-wide if any isosceles trapezoid $T \subseteq P$ whose slanted edges lie on the boundary of $P$ is a
$\delta$-corridor [64]; see Figure 5(b). Note that any $\delta$-wide polygon $R$ of constant complexity is also
$\beta$-thick, with $\beta \in \Theta(\delta)$. We will first solve the problem for convex thick regions, and then
extend the result to non-convex wide polygons.

Analogously to the 1D case, we will store for each region $R \in \mathcal{R}$ a representative point $p$ that
lies somehow "in the middle" of $R$. When the regions are $\beta$-thick, we will use the center point of
the two concentric disks from the thickness definition as representative point. We denote the set of
representative points of the regions in $\mathcal{R}$ by $P$. Let $T$ be the quadtree built over $P$. We
distinguish between true cells, which are necessary in any valid compressed quadtree over $P$, and
B-cells, which may further subdivide a true cell and are only added in order to maintain balance.
We store each representative point $m$ of a region $R$ in $T$ according to the following rule: Let $C_v$
be the smallest quadtree cell containing $m$. If $C_v$ is a true cell, then $m$ is stored in $v$. If $C_v$ is a
B-cell, then $m$ is stored in $u$, the lowest (not necessarily proper) ancestor of $v$ in $T$ such that
$|C_u| \ge |R|/(4\beta)$.

Several new problems are introduced which were not present in the 1D case. We briefly sketch
how to address each of these problems, and then present the complete solution.

Linear distance. When performing a query in the one-dimensional case, the location in the
quadtree of any intersecting region is at most a constant number of cells away. However, in the
two-dimensional case, the location of an intersecting region may be up to a linear number of cells
away, as shown in Figure 6(a). We solve this problem with some additional bookkeeping. Given
a quadtree cell $C_q$, we use two different strategies to locate regions intersecting $C_q$ depending on
their size. All regions of size at least $2\beta|C_q|$ will be located using a marked-ancestor data structure:
an additional search structure which we explain in more detail below. All regions of size less than
$2\beta|C_q|$ which intersect $C_q$ will register a bidirectional pointer with $C_q$ using the following tagging
strategy.

Footnote: Many other notions of fatness exist in the literature. We chose to use thickness because it
is basic and implied by most other definitions, and wideness because it will be convenient to use
Theorem 4.8.



Fig. 6: (a) The intersecting region could be stored a linear distance from the query cell (containing the blue
point). (b) The number of regions which can intersect quadtree leaf $C$ is at most $O(\beta)$, since each
region blocks an $\Omega(1/\beta)$ fraction of a large circle centered at $C$, by similar triangles.



Let $d$ be the smallest diameter of a quadtree cell such that $d \ge |R|/(4\beta)$. Let $S_R$ be the set of
quadtree cells $C$ which intersect $R$ and are either a leaf or have size $|C| = d$. All cells in $S_R$ will
be tagged with a pointer to $R$. Since the quadtree is balanced, given a pointer to any cell in $S_R$,
we can locate all cells in $S_R$ in $O(|S_R|)$ time. By the following lemma, $S_R$ must contain the cell
containing the representative point of $R$.
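
As a small illustration of the threshold used by the tagging strategy, the following Python sketch computes the cell size $d$ for a region. It assumes that cell sizes are of the form `root_size / 2**k` (an aligned quadtree) and that "size" is measured in the same way for regions and cells; the function name `tagging_cell_size` and its parameters are hypothetical.

```python
def tagging_cell_size(region_size, beta, root_size=1.0):
    """Smallest cell size root_size / 2**k (k >= 0) that is >= region_size / (4 * beta).

    Assumption of this sketch: every quadtree cell has size root_size / 2**k.
    """
    if region_size <= 0 or beta <= 0:
        raise ValueError("region_size and beta must be positive")
    threshold = region_size / (4.0 * beta)
    if threshold > root_size:
        raise ValueError("region is too large for the root cell")
    size = root_size
    # Halve the cell while the next smaller cell still satisfies the bound.
    while size / 2.0 >= threshold:
        size /= 2.0
    return size
```

For example, with `region_size=1.0`, `beta=2.0` and a unit root cell, the threshold is 1/8 and the function returns 0.125.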



Lemma 4.1: Let $R$ be a $\beta$-thick region stored by our data structure. If $C$ is the quadtree cell which
stores the representative point of $R$, then $C$ has side length at least $\frac{|R|}{4\beta}$.



Proof: If $C$ is a B-cell, then the claim is true by construction. Suppose $C$ is a true cell. Let $m$
be the representative point of $R$. By the definition of thickness, there exists a disk $I \subseteq R$ centered
at $m$ with $|I| \ge |R|/\beta$. $I$ contains no representative points of regions other than $R$. Let $C$ be the
cell containing $m$. Note that if $C$ contains $m$ and is significantly smaller than $|R|$, then $C$ must
be completely contained in $I$. However, $C$ must be the largest quadtree cell completely contained
in $I$, since if the parent $\bar{C}$ of $C$ in the quadtree is completely contained in $R$, then $\bar{C}$ would not
have been further subdivided because $\bar{C}$ would contain no other points. Therefore, $\bar{C}$ must
have some portion outside of $I$ and must have size larger than $|I|/2$. Thus the size of $C$ is at least
$|I|/4 \ge |R|/(4\beta)$. □

Moreover, by the following lemma $|S_R| = O(\beta)$, and therefore, given the cell containing the
representative point of $R$ we can tag all cells in $S_R$ in $O(\beta)$ time.

Lemma 4.2: Let $R$ be a $\beta$-thick region stored in our data structure, and let $C$ be the quadtree cell
that stores the representative point of $R$. Then there are at most $O(\beta)$ quadtree cells of size $|C|$
required to cover $R$.

Proof: Let $I$ be the largest inscribed disk of $R$. The boundary of $I$ touches the boundary of $R$
in two or three points. If two points, then these are diametral on $I$, so $R$ is contained in a strip
of width $|I|$. If three points, then take the diametral points of these three points and take the
strips of width $|I|$ of these three pairs; $R$ is contained in the union of these three strips. Now, if
$R$ is $\beta$-thick, the portion of the strips it can be in is at most $\beta|I|$ long. So, $R$ can be covered
by $O(\beta)$ disks the size of $I$. Each such disk can be covered by at most $O(1)$ cells of size $|C|$, by
Lemma 4.1. Thus, $O(\beta)$ cells are required to cover $R$. □

Linear overlap. In the one-dimensional case, we store only the center points of our regions, and
the number of regions that overlap any quadtree cell is at most three. In two dimensions, it
appears that we may have a large number of small regions that intersect a quadtree cell. However,
we show in the following lemma that this is not the case.



Lemma 4.3: The number of $\beta$-thick convex regions intersecting any balanced quadtree leaf is $O(\beta)$.



Proof: Let $\mathcal{R}_C$ be the set of thick convex regions that intersect the boundary of leaf $C$, and let $r$
be the radius of a large disk $D$ containing all regions in $\mathcal{R}_C$. For each region $R_j \in \mathcal{R}_C$ there exists
a disk $I_j \subseteq R_j$ with center $m_j$ such that $|I_j| \ge |R_j|/\beta$. Moreover, since each region $R_j$ is convex,
it must contain a triangle consisting of the diameter of $I_j$ and some point $p_j \in R_j \cap C$. Each of
the four sides of $C$ can "see" at most $\pi r$ of the perimeter of $D$. However, by a similar triangles
argument each triangle must block the line of sight from one or more sides to at least $\Omega(r/\beta)$ of
the perimeter (see Figure 6(b)). Thus, since the regions are convex and disjoint, the number of
regions in $\mathcal{R}_C$ is at most $O(\beta)$. □



4.1 Definition of the data structure

At the core, our data structure is similar to the one-dimensional data structure described above:
we have a spatial tree, which allows for efficient updates, and a search tree, which allows for
efficient searching over the quadtree. However, our data structure is augmented to address the
problems introduced by the two-dimensional case. We maintain a dynamic balanced quadtree $Q$
over $P$, which we augment to support mark and unmark operations and marked-ancestor queries,
and we maintain a dynamic edge-oracle tree on the edges of $Q$.



Marked-ancestor tree. Suppose we are given an angle $\phi$ which divides $2\pi$ (i.e., $k\phi = 2\pi$), and
consider the set of angular intervals $\Phi_i = [i\phi, (i+1)\phi]$ (modulo $2\pi$), for integers $1 \le i \le k$. For each
quadtree cell $C$ of $Q$ with center point $c$, we define the wedge $W_C^i$ centered at $c$ and with opening
angle $\phi$ to be the union of all halflines from $c$ in a direction in $\Phi_i$. Let $\mathcal{W}_C = \{W_C^i \mid 1 \le i \le k\}$;
note that $\mathcal{W}_C$ partitions $\mathbb{R}^2$ into $k$ wedges.
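
To make the wedge indexing concrete, here is a minimal Python sketch that returns the index $i$ of the wedge $W_C^i$ containing a given point. The function name `wedge_index` is hypothetical, points are plain `(x, y)` tuples, and boundary directions are assigned to the wedge that starts at them, which is a choice made for this sketch only.

```python
import math


def wedge_index(center, point, k):
    """Return i in 1..k such that the direction from center to point lies in
    Phi_i = [i * phi, (i + 1) * phi) modulo 2*pi, where phi = 2*pi / k."""
    phi = 2.0 * math.pi / k
    dx, dy = point[0] - center[0], point[1] - center[1]
    angle = math.atan2(dy, dx) % (2.0 * math.pi)   # direction in [0, 2*pi]
    i = min(int(angle // phi), k - 1)              # guard against rounding at 2*pi
    # Directions in [0, phi) belong to Phi_k, which wraps around modulo 2*pi.
    return i if i >= 1 else k
```

With `k = 4`, the four diagonal directions from the origin fall into wedges 4, 1, 2 and 3 when visited counterclockwise starting from `(1, 1)`.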

For each $1 \le i \le k$, let $T_i$ be a marked-ancestor structure on $Q$. We mark a cell $C$ in $T_i$ if and
only if there is a region $R \in \mathcal{R}$ of size $2\beta|C| \le |R| < 4\beta|C|$ that intersects $C$, and such that the
center point of $R$ lies in $W_C^i$.

When doing a query, we will only look at the first marked ancestor in each $T_i$. Lemma 4.5 captures
the essential property of the regions which enables this strategy. First, we need the following claim.



31 Claim 4.4: Let /3 be given and set <fi — jy^- Let C be a cell that is marked in Tj by a /3-thick 

region $R$. Let $L_C^i$ be the set of lines that start in $C$ and have a direction in $\Phi_i$. Then every line
in $L_C^i$ intersects $R$.



Proof: Let $m$ be the representative point of $R$. Since $R$ is $\beta$-thick, there exist disks $I \subseteq R \subseteq O$
centered at $m$ with $|O|/|I| \le \beta$. Since $R$ caused $C$ to be marked, $O$ must intersect $C$, and $m$
must lie in $W_C^i$. See Figure 7(a).



37 Now, we need that / intersects all lines in L' c . The distance from m to C is at most \\0\ < 1 

38 Then, the distance from rn to the far edge of W c is at most ^|J|sin(/), and the distance to the 

39 far edge of L' c is at most f |7] sin0 + \\C\. Since \R\ > 2/3\C\, we know that \C\ < Using 

40 4> = jifi implies /3sin0 < < ^. Combining these, we see that |/| > (3\I\ sin0+ |C|, so, / blocks 
4 all lines in L c . M 



Loffler et al., Dynamic Planar Point Location with Sub-Logarithmic Local Updates 







Lemma 4.5: Let $C_1$ be a cell that is marked in $T_i$ by a convex and $\beta$-thick region $R_1$, and let $C_2$
be a descendant of $C_1$ that is marked in $T_i$ by a convex and $\beta$-thick region $R_2$. Then there cannot
be a descendant $C_3$ of $C_2$ that intersects $R_1$.



Proof: Let $R_2$ and $R_1$ be convex fat regions which mark cells $C_2$ and $C_1$ respectively. Then there
is a point $p_2 \in R_2 \cap C_2$. Suppose for contradiction that $R_1$ intersects $C_3$; that is, there exists a
point $p_1 \in R_1 \cap C_3$. Let $r$ and $s$ be two parallel rays from $p_1$ and $p_2$ in some direction in $\Phi_i$.
Note that rays $r$ and $s$ both start in $C_2 \subseteq C_1$, so they belong to both $L_{C_1}^i$ and $L_{C_2}^i$. Therefore
each ray must intersect both $R_1$ and $R_2$ by Claim 4.4. Since each region $R_1$ and $R_2$ is convex, their
intersection with each ray $r$ (or $s$) is a single line segment, denoted $r_1$ and $r_2$ ($s_1$ and $s_2$)
respectively. Moreover, since $R_1$ and $R_2$ are disjoint, the segments $r_1$ and $r_2$ ($s_1$ and $s_2$) are also
disjoint (see Figure 7(b)).

Since $p_1 \in r_1$, $r_1$ must come before $r_2$ on the ray $r$. Similarly, $s_2$ must come before $s_1$ on the
ray $s$. Moreover, $R_1$ is convex, and thus the convex quadrilateral defined by $r_1, s_1$ is completely
contained in $R_1$, and likewise the quadrilateral defined by $r_2, s_2$ is contained in $R_2$. These two
quadrilaterals must intersect, which is a contradiction because $R_1$ and $R_2$ are disjoint. Therefore
there is no point $p_1 \in R_1 \cap C_3$. □



4.2 Handling queries

Given a query point $q$, we want to find out which region (if any) contains $q$. We begin by performing
a point location query for $q$ in the quadtree $Q$. By Lemma 2.6 we can find the leaf cell $C$ in the
quadtree which contains $q$ in $O(\log n)$ time using the edge-oracle tree.

By Lemma 4.3 there can only be $O(\beta)$ regions which intersect $C$. All regions of size at most $2\beta|C|$
will have tagged $C$ with a pointer to themselves, and are immediately available from $C$. Moreover,
we can find all regions of size at least $2\beta|C|$ in $O(\beta \log n)$ time by querying the marked-ancestor
structures. We compare each region to our query point, and determine which region (if any)
contains the query point in $O(\beta)$ time. Thus, we can answer the query in total time $O(\beta \log n)$.
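
The two sources of candidate regions in a query (tags on the leaf, and the first marked ancestor in each $T_i$) can be sketched as follows. This is a toy model under several assumptions that are not in the paper: regions are disks `((cx, cy), radius)`, each cell stores at most one marking region per wedge index, the leaf containing $q$ is given rather than found through the edge-oracle tree, and the marked-ancestor query is a naive walk up parent pointers. The names `Cell`, `first_marked_ancestor` and `query` are hypothetical.

```python
import math


class Cell:
    """Quadtree cell reduced to what the query needs."""

    def __init__(self, parent=None):
        self.parent = parent
        self.tags = []    # small regions registered by the tagging strategy
        self.marks = {}   # wedge index i -> large region that marked this cell in T_i


def first_marked_ancestor(cell, i):
    """Naive O(depth) walk; a real marked-ancestor structure is faster."""
    while cell is not None:
        if i in cell.marks:
            return cell.marks[i]
        cell = cell.parent
    return None


def point_in_disk(q, disk):
    (cx, cy), radius = disk
    return math.hypot(q[0] - cx, q[1] - cy) <= radius


def query(leaf, q, k):
    """Return the region containing q, or None. leaf is the leaf cell containing q."""
    candidates = list(leaf.tags)              # small regions: read directly off the leaf
    for i in range(1, k + 1):                 # large regions: one lookup per wedge tree
        region = first_marked_ancestor(leaf, i)
        if region is not None:
            candidates.append(region)
    for region in candidates:
        if point_in_disk(q, region):
            return region
    return None
```

The candidate list has one entry per tag plus at most one per wedge tree, which mirrors the $O(\beta)$ comparisons in the analysis when $k$ is proportional to $\beta$.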



4.3 Handling updates

We only store the representative points of the regions in the quadtree. Thus, when performing a
local update, it is sufficient to find the new location for the region's representative point, and then
update the quadtree, tags, marked-ancestor trees, and edge-oracle trees accordingly.

Locating the new representative point. Given a pointer to a region $R$, we replace it by another
region $R'$ that is $\rho$-similar to $R$ for an arbitrary parameter $\rho > 1$. Let $p$ and $p'$ be the
representative points of $R$ and $R'$, respectively. We find the leaf cell of $Q$ containing $p'$ by going up in the
quadtree until the size of the cell we are in is similar to the distance to $p'$, then using level-links
to find the ancestor of $p'$ of similar size, and then going back down.

Lemma 4.6: The distance in $Q$ between the leaf $C$ containing $p$ and the leaf $C'$ containing $p'$ is at
most $O(\log(\rho\beta))$.



Proof: Recall that by definition, $|R \cup R'| \le \rho \min\{|R|, |R'|\}$, and by Lemma 4.1 each region is
stored in a quadtree cell proportional to its size, i.e. $|C| \ge \frac{|R|}{4\beta}$. Thus, $|C| \ge \frac{|R \cup R'|}{4\beta\rho}$, and likewise
for $|C'|$. Hence, to find $C'$ from $C$, we move up at most $\log(\beta\rho)$ levels in the quadtree to find
a cell of size $\Omega(|R \cup R'|)$, then follow $O(1)$ level-link pointers to find a large cell containing $p'$.
Finally we move down at most $\log(\beta\rho)$ levels to find $C'$. □
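
The logarithmic bound on the number of levels can be checked numerically with a short Python sketch. It assumes an uncompressed quadtree in which every parent is exactly twice the size of its child (compression can only shorten the walk); `levels_to_climb` is a hypothetical helper, not part of the data structure.

```python
import math


def levels_to_climb(cell_size, target_size):
    """Number of parent steps from a cell of size cell_size until the current
    cell has size at least target_size, with sizes doubling at every level."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    levels = 0
    size = cell_size
    while size < target_size:
        size *= 2.0
        levels += 1
    return levels


# Worst case allowed by the proof: |C| = |R u R'| / (4 * beta * rho).
beta, rho, union_size = 2.0, 4.0, 32.0
start = union_size / (4.0 * beta * rho)
climbed = levels_to_climb(start, union_size)
```

In this instance `start` is 1.0 and `climbed` is 5, which equals $\log_2(4\beta\rho) = \log_2 32$, in line with the $O(\log(\beta\rho))$ bound.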

Updating the quadtree. We must also update the quadtree to reflect the new position of the
representative point. By Lemma 2.2 we can delete $p$, insert $p'$, and perform the corresponding
rebalancing of the quadtree in $O(1)$ worst-case time.

Updating the auxiliary structures. A local update replaces an old region $R$ by a new region $R'$
which is $\rho$-similar to $R$, but may overlap different quadtree cells than $R$. Therefore we may require
updates to the marked-ancestor structure. Let $C$ be the quadtree cell containing the representative
point of $R$. After the update, $R'$ must only intersect $O(\beta)$ quadtree cells which are similar in size to
$C$ by Lemma 4.2. For each of these cells, we test the direction of the representative point of $R'$
and mark it in the corresponding marked-ancestor tree. We also unmark cells which corresponded
to the old region $R$. These updates can be performed in $O(\log \log n)$ time per marked-ancestor
structure. We must also remove tags from all cells in $S_R$ and add tags to cells in $S_{R'}$. However,
given $C$ and $C'$, this takes $O(\beta)$ time by Lemma 4.2. By Lemma 2.6 we can also update the
edge-oracle tree in $O(1)$ time.

Theorem 4.7: A set of $n$ disjoint convex $\beta$-thick objects of constant combinatorial complexity in
$\mathbb{R}^2$ can be maintained in an $O(\beta n)$ size data structure that supports insertion, deletion and point
location queries in $O(\beta \log n)$ time, and $\rho$-similar updates in $O(\beta \log \log n + \log(\beta\rho))$ time. All
time bounds are worst-case, and the data structure can be implemented on a real-valued pointer
machine.



4.4 Non-convex regions

We can extend the result to non-convex fat regions by cutting them into convex pieces. This
approach only works for polygonal objects, since non-polygonal objects cannot always be partitioned
into a finite number of convex pieces. For polygonal objects, we use a theorem by van Kreveld:

Theorem 4.8 (from [64]): A $\delta$-wide simple polygon $P$ with $n$ vertices can be partitioned in $O(n \log^2 n)$

35 time into 0(n) ,5- wide quadrilaterals and triangles, where (3 = min{5, 1 — |v3}. 

We conclude:

Theorem 4.9: A set of $n$ disjoint polygonal $\delta$-wide objects of constant combinatorial complexity in
$\mathbb{R}^2$ can be maintained in an $O(\delta n)$ size data structure that supports insertion, deletion and point
location queries in $O(\delta \log n)$ time, and $\rho$-similar updates in $O(\delta \log \log n + \log(\delta\rho))$ time. All
time bounds are worst-case, and the data structure can be implemented on a real-valued pointer
machine.

Note that $(\alpha, \beta)$-covered objects are $O(\min\{\alpha, \beta\})$-thick and polygonal $(\alpha, \beta)$-covered objects are
$O(\min\{\alpha, \beta\})$-wide, so our results apply to such objects as well.

5 Discussion 

We have shown that we can maintain a set of intervals in $\mathbb{R}^1$ or disjoint fat regions in $\mathbb{R}^2$ in a data
structure that supports $O(\log n)$ point location queries, and local updates in $O(1)$ time in $\mathbb{R}^1$ and
in $O(\log \log n)$ time in $\mathbb{R}^2$. These results are the first of their kind in a geometric
setting. Still, several gaps remain, and there are many open problems left for future research.

We show that the fatness restriction is necessary given our current definition of locality. However,
for non-fat objects, the definition seems to be too powerful: if all regions are skinny but homothetic,
for example, we could solve the problem simply by scaling the plane in one direction. As soon as
the regions have different orientations, however, this simple solution no longer works. It would
be interesting to investigate alternative, more restrictive definitions of similarity that capture this
effect, and analyze to what extent local updates on non-fat objects can then be supported.

Also, it is unclear whether the disjointness condition is necessary. While the restriction is very
natural in applications where the regions represent physical objects, it would be useful to be able
to handle some restricted amount of overlap when the regions represent imprecision. However, it
appears to be hard to extend our approach to this setting: even simply keeping a constant number
of copies of our data structure does not work, because now one needs to assign regions to layers
on the fly, which appears to be non-obvious.

Finally, perhaps the most intriguing question left open regards the update complexity itself. While
the $O(\log \log n)$ update time in the 2-dimensional case is sublogarithmic, it is not clear whether this
is the right bound, or whether constant time updates might be possible, as in the 1-dimensional
case.

A Model of Computation

We wish to store regions described by arbitrary real numbers in our data structure. In
computational geometry, the standard model of computation is the real RAM model. A real RAM is
a random access machine with additional support for real number arithmetic. In particular, one
works with an abstract machine with an array of memory cells, each of which can either store a
single real number, or an integer. One is allowed to perform basic algebraic operations on real
numbers in constant time, and to do integer arithmetic and use integers as address pointers as
on a standard random access machine. Additionally, one sometimes allows conversion from real
numbers to integers (e.g., using a floor operation): this is justified by the fact that in practice,
real numbers are approximated by floating-point numbers on which the floor operation is trivial
to execute, but controversial because it breaks the internal consistency of the computation model.

Similar to the real RAM, we may consider a real-valued pointer machine, which is a pointer
machine with additional support for real number arithmetic. Like the real RAM, it has memory cells
which store real numbers or integers; however, here the integers cannot be manipulated at all,
they only function as abstract "pointers" to other memory cells.

In our data structure and the associated algorithms, we need to be able to compare real numbers to
integers. Furthermore, to build a quadtree, we need an operation that, given a set of real numbers,
provides us with an interval that contains all numbers in the set, and whose length is approximately
the difference between the largest and smallest numbers in the set. On limited-precision machines
supplied with a floor operation, we can easily find the smallest interval containing the numbers
whose length and end points are powers of 2, and use this to keep the quadtree aligned with the
number system of the machine. In the description of our results, we assume that this is the case.
However, if we are not able to convert real numbers to integers, as we would not be on a pure
real RAM or real-valued pointer machine, we can also simply return the interval spanned by the
smallest and largest element of such a set, and use real arithmetic to subdivide the interval and
construct a quadtree. For this, we additionally need to be able to compare real numbers to each
other, to add and subtract them, and to divide them by 2. In Appendix C.1 we describe how to
deal with compressed quadtrees on a pure real-valued pointer machine, in which no floor operation
is available. All other machinery operates on the combinatorial tree. We do need to manipulate
integers (i.e., pointers) in order to use the marked-ancestor data structure by Alstrup et al. [2].
In Appendix C.2 we describe how to adapt this structure to a pointer machine, at the cost of an
increase in query time (but since our queries are dominated by point location anyway, this does
not affect our final result).
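
The floor-based alignment mentioned above can be sketched in Python as follows. The sketch reads "aligned" as an interval of the form $[m \cdot 2^e, (m+1) \cdot 2^e)$ for integers $m$ and $e$, which is an interpretation rather than the paper's wording, and it assumes non-negative inputs, because no such interval straddles zero. The function name `dyadic_bounding_interval` is hypothetical.

```python
import math


def dyadic_bounding_interval(values):
    """Smallest interval [m * 2**e, (m + 1) * 2**e) with integers m, e that
    contains all values. Relies on floor, as in the limited-precision setting."""
    lo, hi = min(values), max(values)
    if lo < 0:
        raise ValueError("this sketch assumes non-negative coordinates")
    if hi == lo:
        raise ValueError("need at least two distinct values")
    # frexp gives hi - lo = mant * 2**exp with 0.5 <= mant < 1, so 2**exp is the
    # smallest power of two strictly larger than the spread.
    _mant, e = math.frexp(hi - lo)
    while True:
        length = 2.0 ** e
        m = math.floor(lo / length)        # index of the aligned cell containing lo
        if hi < (m + 1) * length:          # does the same cell also contain hi?
            return (m * length, (m + 1) * length)
        e += 1                             # otherwise try the next coarser level
```

For the input `[5.0, 6.5]` the cell `[4, 6)` of length 2 does not contain 6.5, so the function returns the next coarser aligned cell, `(4.0, 8.0)`.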

B Lower Bounds

In this section, we will investigate lower bounds on updates. Clearly, there cannot be any
non-trivial lower bounds if we do not restrict the time we are allowed to spend on queries, so we will
restrict our attention to data structures that support $O(\log n)$ queries. We will first argue that
insertions must take $\Omega(\log n)$ time, and then extend the argument to show that updates cannot be
implemented any faster unless they are local. Finally, we show that some of the restricted settings
we use are necessary.

B.1 Insertions and deletions

The relationship between preprocessing time, insertion time, and query time in dynamic data
structures is well-studied. Borodin et al. [13] first showed that if membership queries in an
ordered set need to be supported in sublinear time, then insertions must necessarily take $\Omega(\log n)$
comparisons. The seminal paper by Ben-Or [8], relating the height of a computation tree to the
connected components in the space of possible inputs to a problem, made it possible to make the
same argument in algebraic computation trees. We base our lower bounds on a reduction to the
semi-dynamic membership problem, which was shown by Brodal and Jacob [14] to have an $\Omega(\log n)$
lower bound for queries and insertions on a real RAM.

Theorem B.1 (from [14]): Let $D$ be a data structure that maintains a set $S$ of $n$ real numbers that
supports insertions in $I(n)$ time and membership queries in $Q(n)$ time. Then we have $I(n) =$

From this result, we easily obtain an $\Omega(\log n)$ lower bound on insertions for our problem.

Corollary B.2: Let $D$ be a data structure that stores a set $\mathcal{R}$ of $n$ regions in $\mathbb{R}^d$, and allows for
point location queries in $Q(n)$ time and insertions/deletions in $I(n)$ time. If $Q(n) = o(n)$, then
$I(n) = \Omega(\log n)$.



B.2 Local updates

To obtain lower bounds on the complexity of local updates, the standard approach does not work
directly. After all, every element that gets moved locally must have been inserted before, so in any
static argument involving $n$ elements we already need to spend $\Omega(n \log n)$ time just to initialize
the structure. Instead, we will argue that when the local updates are sufficiently powerful, we may
start with a data structure that already contains $n$ elements, and use the local updates to simulate
insertions. If we identify an invariant on the elements and show that it is maintained after the
updates, we can simulate arbitrarily many rounds of insertions, and their processing time can no
longer be charged to the initial (true) insertions into our data structure.

Lemma B.3: Let $D$ be a data structure that stores a set $\mathcal{R}$ of $n$ regions in $\mathbb{R}^d$, and allows for point
location queries in $Q(n)$ time and updates in $U(n)$ time. Let $\mathcal{R}$ be a set of regions on which
there exists some order $O : \mathcal{R} \to \mathbb{N}$. Suppose that for any permutation $\pi$ of $n$ elements, there
exists a sequence of $O(n)$ updates $S_\pi$ that turns $\mathcal{R}$ into $\mathcal{R}'$ such that $O(\mathcal{R}) = \pi(O(\mathcal{R}'))$. Then if
$Q(n) = o(n)$, $U(n) = \Omega(\log n)$.

The above lemma is a fairly straightforward consequence of [15].



Unbounded moving. We first show that if we only allow regions to be moved (not scaled), but
have no bound on the distance they may move, we still have an $\Omega(\log n)$ lower bound.

Lemma B.4: Let $D$ be a data structure that stores a set $\mathcal{R}$ of $n$ disjoint regions in $\mathbb{R}^d$, and allows
for point location queries in $Q(n)$ time, insertions in $I(n)$ time, and move updates in $U(n)$ time.
If $Q(n) = o(n)$, then $U(n) = \Omega(\log n)$.



Proof: Let $\mathcal{I}$ be the set of intervals $\{I_i = [i, i+1) \mid i = 1, \ldots, n\}$, and let $\mathcal{I}_j$ be $\mathcal{I}$ translated by
$jn$. Given any permutation $\pi$ on $n$ elements, there is clearly a sequence of $n$ move updates that
takes the elements of $\mathcal{I}_j$ and turns them into $\pi(\mathcal{I}_{j+1})$: every element can move directly to its new
location. Therefore, by Lemma B.3 we must have $U(n) = \Omega(\log n)$. Since all intervals of $\mathcal{I}_j \cup \mathcal{I}_{j+1}$
are disjoint, no interval will overlap any other interval during the execution of the updates, and
we maintain a ply of 1. □
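
One round of the construction in this proof can be simulated with a short Python sketch that moves the intervals of block $j$ onto a permuted arrangement in block $j+1$ and checks after every single move that the intervals stay disjoint. The function name `move_round` and the encoding of the permutation as a list `perm`, with `perm[i - 1]` the target slot of interval `i`, are assumptions of the sketch.

```python
def move_round(n, j, perm):
    """Move interval i of block j, [j*n + i, j*n + i + 1), onto slot perm[i - 1]
    of block j + 1, one interval at a time, verifying that the ply stays 1."""
    if sorted(perm) != list(range(1, n + 1)):
        raise ValueError("perm must be a permutation of 1..n")
    current = {i: (j * n + i, j * n + i + 1) for i in range(1, n + 1)}
    for i in range(1, n + 1):
        slot = perm[i - 1]
        current[i] = ((j + 1) * n + slot, (j + 1) * n + slot + 1)
        spans = sorted(current.values())
        # Half-open intervals are disjoint iff each one ends before the next starts.
        if any(a[1] > b[0] for a, b in zip(spans, spans[1:])):
            raise AssertionError("two intervals overlap: ply exceeded 1")
    return current


result = move_round(4, 0, [3, 1, 4, 2])
order = [i for i, _span in sorted(result.items(), key=lambda item: item[1])]
```

After the round, reading the intervals from left to right gives the order `[2, 4, 1, 3]`, so the $n$ moves have realized the requested permutation without any overlap along the way.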
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36 Unbounded scaling. If we restrict moving distances but allow full freedom in the precision 
changes, and allow the ply to become 2, then the above argument can be trivially adapted: we can
permute the set of intervals by first growing each interval large enough to contain the whole domain
of interest, and then shrinking it to its new location. We now show that even if we insist on disjoint
intervals, there is still an $\Omega(\log n)$ lower bound on unrestricted scaling, even if we only either grow
or shrink the intervals.



Lemma B.5: Let $D$ be a data structure that stores a set $\mathcal{I}$ of $n$ intervals in $\mathbb{R}^1$ and allows for
point location queries in $Q(n)$ time and updates in $U(n)$ time subject to the following restrictions:
no more than one interval is allowed to overlap any point; an update may replace interval $I$ by an
interval $I'$ that is within distance $2|I|$ and has size $0 < |I'| \le 2|I|$ (i.e., the interval can shrink
arbitrarily). Then, if $Q(n) = o(n)$, $U(n) = \Omega(\log n)$.



Proof: Let $\mathcal{I} = \{I_i = [2^i, 2^{i+1}) \mid i = 1, \ldots, n\}$. That is, the intervals have their left endpoints
aligned on powers of 2, have exponentially increasing size, and do not intersect each other. Let $\mathcal{I}_j$
be $\mathcal{I}$, scaled down by a factor $2^{jn}$. Then in a local update any interval from $\mathcal{I}_j$ can be mapped
to any interval in $\mathcal{I}_{j+1}$. Therefore, in $n$ updates which never cause the ply to exceed one, the
order of the intervals can be permuted arbitrarily. Thus, by Lemma B.3, if $Q(n) = o(n)$, then
$U(n) = \Omega(\log n)$. □

Note that by reversing the direction of the updates, the same argument holds for arbitrary growing
without shrinking.



Unbounded skinniness. When $d > 1$, we additionally require the regions to be fat. Without this
requirement, it is not obvious how one should define similarity of regions. Using the definition
from Section 1.1, we can easily adapt the above interval constructions to skinny rectangles.



Lemma B.6: Let $D$ be a data structure that stores a set $\mathcal{R}$ of $n$ disjoint regions in $\mathbb{R}^d$, $d \ge 2$, and
allows for point location queries in $Q(n)$ time, insertions in $I(n)$ time, and similar updates in $U(n)$
time. If $Q(n) = o(n)$, then $U(n) = \Omega(\log n)$.



Proof: Let $\mathcal{I}$ be the set of intervals as constructed in Lemma B.4. Extend the intervals to a set
of rectangles $\mathcal{R} = \mathcal{I} \times [0, n]$. Now all elements in $\mathcal{R}$ have diameter bigger than $n$. Every update in
the proof of Lemma B.4 moves an interval over a distance of at most $2n$; clearly, the corresponding
update of the rectangle is 3-similar. □

On the other hand, if all regions are convex and homothetic as in the proof above, then they can
be made fat by simply scaling the plane. It would be interesting to investigate alternative, more
restrictive definitions of similarity that capture this effect, and analyze to what extent local updates
on non-fat objects can then be supported.



Unbounded ply. If we allow the regions to overlap arbitrarily, then clearly a single update can
cause a linear number of changes to a subdivision in the plane. Thus, no method which explicitly
maintains the regions will be able to handle such updates efficiently. Moreover, it seems
impossible to maintain the set of regions implicitly without requiring some sort of hierarchical
subdivision of the regions, which would then require updates to take at least Ω(log n) time.

Similarly, the update complexity may depend on the current ply of the regions (that is, the number
of regions which intersect a common sub-region). If all regions share a common interior, then
in O(n) shrink operations we can permute them arbitrarily within the common region, which
again implies an Ω(log n) lower bound. However, even if we restrict the ply, and even in the
one-dimensional case, we have an Ω(log n) lower bound for updates if we want to allow arbitrary
shrinking.

C Extensions 

We show how to extend our data structure so that:

• Our compressed quadtree does not require the floor operation.

• The marked-ancestor component can be implemented on a pointer machine.

C.1 Arbitrary scales and compressed quadtrees

In this paper, we assumed that compressed nodes in a quadtree are aligned with their parents.
However, aligning a node at an arbitrary scale is not supported in constant time on a Real RAM,
unless we can use the floor operation (or a different non-standard operation [39, Chapter 2]).
While this is a very natural assumption in practice and does not hinder the implementation of
our algorithms, it is also "unreasonably powerful" in theory, so we would like to avoid its use to
strengthen our theoretical bounds.
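To make concrete why the floor operation trivializes alignment, the following minimal Python sketch computes the grid-aligned square of a given side length that contains a point. The function name `aligned_cell` and the convention that aligned cells have corners at integer multiples of their side length are assumptions made for illustration, not definitions taken from this paper.

```python
import math

def aligned_cell(x, y, side):
    """Return the lower-left corner of the grid-aligned square of the
    given side length that contains the point (x, y).

    Assumption (illustrative): aligned cells of side `side` have their
    corners at integer multiples of `side`. The two calls to math.floor
    are the non-standard operation that a plain Real RAM does not offer.
    """
    if side <= 0:
        raise ValueError("side must be positive")
    return (math.floor(x / side) * side, math.floor(y / side) * side)

corner = aligned_cell(5.3, 2.7, 0.5)
```

Without floor, the same corner would have to be found by a search over scales, which is what the amortized argument in this section avoids.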

A standard way to avoid this problem in the literature is to allow compressed nodes to be associated
with any square that is contained in the parent square and sufficiently small [16, 39, 49]. This is
fine in a static context, but in our dynamic quadtrees we have to be more careful: after a number
of merge operations the size difference between a compressed node and its parent may become less
than a factor a, and then we cannot simply connect the two trees since they are not aligned.

However, in a compressed quadtree with non-aligned compressed nodes we can still align nodes
when necessary in O(1) amortized time, which we now show. We will view each compressed node
as a cut, which divides its ancestors and descendants into different components which may not be
aligned with each other. Let n be the number of nodes in the quadtree and let $N_i$ be the number
of nodes in component i. We define our potential function for each component as

$$\Phi_i = N_i(\log n - \log N_i)$$

and the total potential function as $\Phi = \sum_i \Phi_i$.
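As a small illustration of this definition, the sketch below evaluates the potential for a given list of component sizes. The function names are hypothetical, and the choice of base-2 logarithms is an assumption (the text does not fix the base).

```python
import math

def component_potential(n, size):
    """Phi_i = N_i * (log n - log N_i), with base-2 logarithms (assumed)."""
    return size * (math.log2(n) - math.log2(size))

def total_potential(sizes):
    """Sum of Phi_i over all components; n is the total number of nodes."""
    n = sum(sizes)
    return sum(component_potential(n, s) for s in sizes)

single = total_potential([8])      # one component holding all 8 nodes
split = total_potential([4, 4])    # the same 8 nodes in two components
```

Under these assumptions a single component has potential 0, and the potential grows as the nodes are spread over more, smaller components.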

We now analyze the cost of each local and global update operation.

• insert into existing component:

The insertion takes O(log n) time and adds O(1) nodes to this component of the quadtree. For
each node we add, we increment n and $N_i$ by 1. Therefore the change in potential of the
component is

$$
\begin{aligned}
\Delta\Phi &= -N_i(\log n - \log N_i) + (N_i + 1)\bigl(\log(n + 1) - \log(N_i + 1)\bigr) \\
&= N_i\left(\log\frac{n+1}{n} - \log\frac{N_i+1}{N_i}\right) + \bigl(\log(n + 1) - \log(N_i + 1)\bigr) \\
&= \log(n + 1) + \text{negative terms} \\
&= O(\log n).
\end{aligned}
$$

Therefore, the total amortized cost is $O(\log n + \Delta\Phi) = O(\log n)$.

• insert into new component:

An insertion may create a new compressed node. In this case, we create the corresponding
component, and for each of the O(1) nodes created in the new component, we have the
following change in potential:

$$\Delta\Phi = 1 \cdot (\log n - \log 1) = O(\log n).$$

Therefore the total amortized cost in this case is also O(log n).

• merge 2 components:

If the size difference between the two components becomes less than a factor of a, then they
must be merged. We must make sure that the two components are aligned, and so we spend
O(1) time for each node in the smaller component to align them with the larger component.
Let $N_S$ be the number of nodes in the smaller component, $N_L$ be the number of nodes in
the larger component and $N = N_S + N_L$ be the total number of nodes in both components.
Note that $N \ge 2N_S$. The change in potential for these two components is

$$
\begin{aligned}
\Delta\Phi &= -N_S(\log n - \log N_S) - N_L(\log n - \log N_L) + N(\log n - \log N) \\
&= N_S \log N_S + N_L \log N_L - N \log N \\
&= N_S(\log N_S - \log N) + N_L(\log N_L - \log N) \\
&\le N_S(-\log 2) + N_L\left(-\log\frac{N}{N_L}\right) \\
&\le -N_S.
\end{aligned}
$$

Therefore the amortized cost of merging the two components is $O(N_S - N_S) = O(1)$.

• deletion:

If we delete a node out of a component containing N nodes, then the change in potential is

$$
\begin{aligned}
\Delta\Phi &= -N(\log n - \log N) + (N - 1)\bigl(\log(n - 1) - \log(N - 1)\bigr) \\
&= N \log\frac{N}{N-1} - N \log\frac{n}{n-1} - \bigl(\log(n - 1) - \log(N - 1)\bigr) \\
&\le O(1).
\end{aligned}
$$

Therefore the amortized cost of a deletion is O(log n + 1) = O(log n).

• local updates:

Local updates do not change the total number of nodes, and move at most O(1) nodes from
one component to another. Therefore, the change in potential of a local update is O(1), and
the amortized cost is O(log log n + 1) = O(log log n).

Lemma C.1: We can align compressed subtrees by the time they are connected, in O(1) amortized
time per split or merge operation.
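The merge case is the step that pays for the alignment work, so it may help to check it numerically. The sketch below, under the assumption of base-2 logarithms and with hypothetical function names, verifies for a range of component sizes that merging a smaller component of size `small` into a larger one of size `large` lowers the potential by at least `small`. The extra 10 nodes placed outside the two components are an arbitrary choice; the change in potential does not depend on them.

```python
import math

def phi(n, size):
    """Potential of one component: size * (log n - log size), base 2."""
    return size * (math.log2(n) - math.log2(size))

def merge_delta(n, small, large):
    """Change in potential when components of sizes small <= large merge."""
    merged = small + large
    return phi(n, merged) - phi(n, small) - phi(n, large)

def check_merge_bound(max_size=60):
    """Return True if the drop is at least `small` for all tested sizes."""
    for small in range(1, max_size + 1):
        for large in range(small, max_size + 1):
            n = small + large + 10   # hypothetical: 10 nodes elsewhere
            if merge_delta(n, small, large) > -small + 1e-9:
                return False
    return True
```

For two components of equal size 4, for example, the potential drops by 8, which is twice the number of nodes that need to be realigned.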



C.2 Marked-ancestor queries on a pointer machine 

We now show how to adapt the marked-ancestor structure of Alstrup et al. [3] so that it works
on a pointer machine.

Suppose that we are given a tree T over which we want to support marked-ancestor queries.
Recall that a heavy node is a node with at least two children. Alstrup et al. maintain what they
call an ART-universe. That is, they partition the nodes of T into micro-trees such that each
micro-tree has at most O(log n) heavy nodes, and any leaf-to-root path passes through at most
O(log n / log log n) micro-trees. Thus, they reduce any marked-ancestor query in T to at most
O(log n / log log n) exists queries on the micro-trees, which determine if each micro-tree on the
path to the root contains a marked ancestor, and one marked-ancestor query in the first micro-tree
which contains a marked ancestor. The final marked-ancestor query in the micro-tree is answered
by determining which of the at most O(log n) paths in the micro-tree contains a marked ancestor,
and then performing a marked successor query on that path.
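The marked successor query on a single path can be pictured with the following minimal Python sketch. It is only meant to show the interface (mark, unmark, successor). It stores marked positions in a sorted array and uses binary search, which is a RAM-style stand-in and not a pointer-machine structure. The class name `PathMarks` and the convention that larger positions lie closer to the root are assumptions of this sketch.

```python
import bisect

class PathMarks:
    """Marked-successor queries on a path with positions 0..length-1,
    where larger positions are closer to the root (assumed convention)."""

    def __init__(self, length):
        self.length = length
        self.marked = []          # sorted list of marked positions

    def mark(self, pos):
        i = bisect.bisect_left(self.marked, pos)
        if i == len(self.marked) or self.marked[i] != pos:
            self.marked.insert(i, pos)

    def unmark(self, pos):
        i = bisect.bisect_left(self.marked, pos)
        if i < len(self.marked) and self.marked[i] == pos:
            self.marked.pop(i)

    def successor(self, pos):
        """First marked position >= pos, or None if there is none."""
        i = bisect.bisect_left(self.marked, pos)
        return self.marked[i] if i < len(self.marked) else None
```

A query from a node at position `pos` returns the nearest marked node on the way towards the root, counting the node itself.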

The reduction from queries in T to queries in micro-trees only requires a pointer machine. However,
they require a word-RAM to support their queries within micro-trees in two places. First, they
maintain connectivity between the O(log n) paths within a micro-tree using the bit-manipulation
techniques of [3]. Second, they use a RAM-based implementation to support their marked successor
queries on the marked path. Thus, if we replace these two data structures, we will support all
operations on a pointer machine.

The latter data structure is easy to replace. We just use a pointer-machine based implementation
of a Union-Split-Find data structure [3H ED [53 HI] to support the marked successor queries on a
path. We now describe how to replace the former data structure.

We keep the same subdivision of a micro-tree into O(log n) paths, but instead of using
bit-manipulations to keep track of the O(log n) paths, we build a tree on the paths. By construction,
each path does not contain any heavy nodes in its interior. Therefore, we can compress each
path in the micro-tree to a single node representing the path, where each compressed path-node
is marked if and only if at least one node on the corresponding path is marked. The result is a
tree with a logarithmic number of nodes. Over our path-node-tree, we build the Link-Cut data
structure of Sleator and Tarjan [63], which maintains a dynamic forest and supports the operations
link, cut, and find-root in O(log N) time, where N is the number of nodes in the forest. Just as
Union-Split-Find is equivalent to the marked successor problem, the link-cut trees support all the
operations required for the marked-ancestor problem. The link operation corresponds to unmark,
and the cut operation corresponds to mark. Likewise, the find-root operation, which
returns the root of the current tree, corresponds to the marked-ancestor query. Since the number
of nodes in our path-node-tree is N = O(log n), this data structure supports all marked-ancestor
operations on the path-node-tree in O(log log n) time.
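The correspondence between forest operations and marking can be sketched in a few lines of Python. This is a naive stand-in, not a link-cut tree: `find_root` simply follows parent pointers, so it takes time proportional to the depth rather than O(log N). The class name, the dictionary input format, and the convention that a marked node counts as its own marked ancestor are assumptions of the sketch.

```python
class NaiveMarkedAncestor:
    """Marked-ancestor queries via cut / link / find-root on a forest.

    Parent pointers are followed one at a time; a real link-cut tree
    would replace this walk to obtain the logarithmic bound.
    """

    def __init__(self, parent):
        self.tree_parent = dict(parent)    # fixed shape of the tree
        self.forest_parent = dict(parent)  # edges currently present
        self.marked = set()

    def mark(self, v):
        self.marked.add(v)
        self.forest_parent[v] = None       # cut v from its parent

    def unmark(self, v):
        self.marked.discard(v)
        self.forest_parent[v] = self.tree_parent[v]  # link v back

    def find_root(self, v):
        while self.forest_parent[v] is not None:
            v = self.forest_parent[v]
        return v

    def lowest_marked_ancestor(self, v):
        # The root of the whole tree is also a forest root, so we must
        # check that the root we found is actually marked.
        root = self.find_root(v)
        return root if root in self.marked else None
```

Cutting a marked node from its parent makes it the root of its own tree, so the forest root reached from any node below it is exactly the lowest marked ancestor.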

Thus, all of the components of the data structure are now supported on a pointer machine. To
perform a query in T, we perform at most O(log n / log log n) marked-ancestor queries in the
micro-trees. When we reach the first marked path-node in a micro-tree, we also perform a marked
successor query on this path, and the returned node is the first marked ancestor in T. Since
the time spent in each micro-tree is at most O(log log n), the total time required for a query is
O(log n).
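Written out as a worked equation, the query bound is the number of micro-trees visited times the time spent per micro-tree, plus the single final marked successor query. The symbols $T_{\text{query}}(n)$ and $S(n)$ (the cost of that final query) are introduced here for illustration only, and the last step assumes $S(n) = O(\log n)$:

$$
T_{\text{query}}(n) = O\!\left(\frac{\log n}{\log\log n}\right) \cdot O(\log\log n) + S(n) = O(\log n).
$$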

To perform a mark/unmark update of a node $v \in T$, we perform the corresponding update on
the path P containing v. If this is/was the only marked node in P, then we also update the
corresponding path-node $u_P$ in the link-cut data structure containing $u_P$. Thus we update a
constant number of data structures, and each update takes O(log log n) time.
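The propagation rule in this update (touch the upper level only when a path gains its first or loses its last marked node) can be sketched as follows. The class name, the `path_of` input mapping, and the use of a per-path counter to detect the "only marked node" case are assumptions of this sketch. The counter `path_node_updates` merely records how often the path-node structure would be updated.

```python
class PathLevelMarks:
    """Propagate node marks to path-nodes.

    `path_of` maps every node to the identifier of the path containing it
    (a hypothetical input format chosen for this sketch).
    """

    def __init__(self, path_of):
        self.path_of = dict(path_of)
        self.marked_nodes = set()
        self.marked_count = {}        # path id -> number of marked nodes on it
        self.path_node_updates = 0    # updates sent to the upper level

    def path_is_marked(self, path):
        return self.marked_count.get(path, 0) > 0

    def mark(self, v):
        if v in self.marked_nodes:
            return
        self.marked_nodes.add(v)
        path = self.path_of[v]
        self.marked_count[path] = self.marked_count.get(path, 0) + 1
        if self.marked_count[path] == 1:
            # v is now the only marked node on its path: mark the path-node.
            self.path_node_updates += 1

    def unmark(self, v):
        if v not in self.marked_nodes:
            return
        self.marked_nodes.remove(v)
        path = self.path_of[v]
        self.marked_count[path] -= 1
        if self.marked_count[path] == 0:
            # v was the only marked node on its path: unmark the path-node.
            self.path_node_updates += 1
```

Each call changes at most one path-level structure and at most one path-node, matching the constant number of structure updates claimed above.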

Lemma C.2: We can maintain a data structure over any rooted tree T which supports insertions
and deletions of leaves in O(1) amortized time, marking and unmarking nodes in O(log log n)
worst-case time, and queries for the lowest marked ancestor in O(log n) worst-case time. All
operations are supported on a pointer machine.



